Joint multilingual learning for coreference resolution
نویسنده
چکیده
Natural language is a pervasive human skill not yet fully achievable by automated computing systems. The main challenge is understanding how to computationally model both the depth and the breadth of natural languages. In this thesis, I present two probabilistic models that systematically model both the depth and the breadth of natural languages for two different linguistic tasks: syntactic parsing and joint learning of named entity recognition and coreference resolution. The syntactic parsing model outperforms current state-of-the-art models by discovering linguistic information shared across languages at the granular level of a sentence. The coreference resolution system is one of the first attempts at joint multilingual modeling of named entity recognition and coreference resolution with limited linguistic resources. It performs second best on three out of four languages when compared to state-of-the-art systems built with rich linguistic resources. I show that we can simultaneously model both the depth and the breadth of natural languages using the underlying linguistic structure shared across languages. Thesis Supervisor: Peter Szolovits Title: Professor of Computer Science and EngineeringMIT CSAIL Thesis Supervisor: Pierre Zweigenbaum Title: Senior Researcher, CNRS Thesis Supervisor: Özlem Uzuner Title: Associate Professor of Information StudiesSUNY, Albany
منابع مشابه
Corpus based coreference resolution for Farsi text
"Coreference resolution" or "finding all expressions that refer to the same entity" in a text, is one of the important requirements in natural language processing. Two words are coreference when both refer to a single entity in the text or the real world. So the main task of coreference resolution systems is to identify terms that refer to a unique entity. A coreference resolution tool could be...
متن کاملMachine Learning for Mention Head Detection in Multilingual Coreference Resolution
This work introduces a machine learning approach to the identification of mention heads needed for multilingual coreference resolution (MCR). We evaluate the method and compare it to a heuristic baseline and a rule-based approach, which are widely used in coreference resolution systems. We use the CoNLL-2012 shared task data sets, which include data for Arabic, Chinese, and English. We show tha...
متن کاملTranslation-Based Projection for Multilingual Coreference Resolution
To build a coreference resolver for a new language, the typical approach is to first coreference-annotate documents from this target language and then train a resolver on these annotated documents using supervised learning techniques. However, the high cost associated with manually coreference-annotating documents needed by a supervised approach makes it difficult to deploy coreference technolo...
متن کاملCorefrence resolution with deep learning in the Persian Labnguage
Coreference resolution is an advanced issue in natural language processing. Nowadays, due to the extension of social networks, TV channels, news agencies, the Internet, etc. in human life, reading all the contents, analyzing them, and finding a relation between them require time and cost. In the present era, text analysis is performed using various natural language processing techniques, one ...
متن کاملLearning to Model Multilingual Unrestricted Coreference in OntoNotes
Coreference resolution, which aims at correctly linking meaningful expressions in text, is a much challenging problem in Natural Language Processing community. This paper describes the multilingual coreference modeling system of Web Information Processing Group, Henan University of Technology, China, for the CoNLL-2012 shared task (closed track). The system takes a supervised learning strategy,...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2014